Contents

7.3 DATA ANALYSIS WITH QIIME2        263
7.3.1 QIIME2 Input Files        265
7.3.1.1 Importing Sequence Data        265
7.3.1.2 Metadata        269
7.3.2 Demultiplexing        269
7.3.3 Downloading and Preparing the Example Data        271
7.3.3.1 Downloading the Raw Data        271
7.3.3.2 Creating the Sample Metadata File        272
7.3.3.3 Importing Microbiome Yoga Data        274
7.3.4 Raw Data Preprocessing        275
7.3.4.1 Quality Assessment and Quality Control        275
7.3.4.2 Clustering and Denoising        278
7.3.5 Taxonomic Assignment with QIIME2        289
7.3.5.1 Using Alignment-Based Classifiers        289
7.3.5.2 Using Machine Learning Classifiers        291
7.3.6 Construction of Phylogenetic Tree        297
7.3.6.1 De Novo Phylogenetic Tree        297
7.3.6.2 Fragment-Insertion Phylogenetic Tree        298
7.3.7 Alpha and Beta Diversity Analysis        298
7.4 SUMMARY        300
REFERENCES        301

Chapter 8        Shotgun Metagenomic Data Analysis        303

8.1 INTRODUCTION        303
8.2 SHOTGUN METAGENOMIC ANALYSIS WORKFLOW        305
8.2.1 Data Acquisition        305
8.2.2 Quality Assessment and Processing        305
8.2.3 Removing Host DNA Reads        306
8.2.3.1 Download Human Reference Genome        306
8.2.3.2 Mapping Reads to the Reference Genome        307
8.2.3.3 Converting SAM to BAM Format        307
8.2.3.4 Separating Metagenomic Reads in BAM Files        307
8.2.3.5 Creating Paired-End FASTQ Files from BAM Files        308
8.2.4 Assembly-Free Taxonomic Profiling        310
8.2.4 Assembly of Metagenomes        315